As nuclear facilities experience digital transformation and advanced reactor development, AI integration, cyber-physical security, and other emerging technologies such as autonomous robot operations are increasingly developed. However, evaluation and deployment is challenged by the lack of dedicated virtual testbeds. The Immersive Framework for Advanced Nuclear (iFAN) ecosystem is developed, a comprehensive digital twin framework with a realistic 3D environment with physics-based simulations. The iFAN ecosystem serves as a high-fidelity virtual testbed for plant operation, cybersecurity, physical security, and robotic operation, as it provides real-time data exchange for pre-deployment verification. Core features include virtual reality, reinforcement learning, radiation simulation, and cyber-physical security. In addition, the paper investigates various applications through potential operational scenarios. The iFAN ecosystem provides a versatile and secure architecture for validating the next generation of autonomous and cyber-resilient nuclear operations.
The prevailing paradigm in AI for physical systems, scaling general-purpose foundation models toward universal multimodal reasoning, confronts a fundamental barrier at the control interface. Recent benchmarks show that even frontier vision-language models achieve only 50-53% accuracy on basic quantitative physics tasks, behaving as approximate guessers that preserve semantic plausibility while violating physical constraints. This input unfaithfulness is not a scaling deficiency but a structural limitation. Perception-centric architectures optimize parameter-space imitation, whereas safety-critical control demands outcome-space guarantees over executed actions. Here, we present a fundamentally different pathway toward domain-specific foundation models by introducing compact language models operating as Agentic Physical AI, in which policy optimization is driven by physics-based validation rather than perceptual inference. We train a 360-million-parameter model on synthetic reactor control scenarios, scaling the dataset from 10^3 to 10^5 examples. This induces a sharp phase transition absent in general-purpose models. Small-scale systems exhibit high-variance imitation with catastrophic tail risk, while large-scale models undergo variance collapse exceeding 500x reduction, stabilizing execution-level behavior. Despite balanced exposure to four actuation families, the model autonomously rejects approximately 70% of the training distribution and concentrates 95% of runtime execution on a single-bank strategy. Learned representations transfer across distinct physics and continuous input modalities without architectural modification.
Designing nuclear reactor cores requires navigating large discrete design spaces governed by complex neutronic interactions. Traditional deterministic, metaheuristic, and machine-learning-assisted methods search within fixed, human-defined configuration spaces, limiting their ability to discover fundamentally new design topologies. Here we introduce ReactorFold, a generative framework that reformulates fuel-assembly design as a sequence modeling problem for language models. Using Monte Carlo data, parameter-efficient fine-tuning, and Direct Preference Optimization (DPO), the model learns the latent structure of a pressurized-water-reactor assembly and generates candidate layouts in a single forward pass. Notably, the DPO-aligned model exhibits emergent design-space expansion: despite being trained exclusively on configurations with a fixed number of gadolinium burnable absorber (Gd) rods, it autonomously adjusts Gd inventory to satisfy strict power-peaking constraints. The model also discovers high-performing asymmetric configurations that challenge conventional symmetric loading heuristics, accessing design regimes inaccessible to conventional search methods and demonstrating that language models can internalize causal physical relationships and transcend human-imposed design constraints.
Shallow Recurrent Decoder networks are a novel data-driven methodology able to provide accurate state estimation in engineering systems, such as nuclear reactors. This deep learning architecture is a robust technique designed to map the temporal trajectories of a few sparse measures to the full state space, including unobservable fields, which is agnostic to sensor positions and able to handle noisy data through an ensemble strategy, leveraging the short training times and without the need for hyperparameter tuning. Following its application to a novel reactor concept, this work investigates the performance of Shallow Recurrent Decoders when applied to a real system. The underlying model is represented by a fluid dynamics model of the TRIGA Mark II research reactor; the architecture will use both synthetic temperature data coming from the numerical model and leveraging experimental temperature data recorded during a previous campaign. The objective of this work is, therefore, two-fold: 1) assessing if the architecture can reconstruct the full state of the system (temperature, velocity, pressure, turbulence quantities) given sparse data located in specific, low-dynamics channels and 2) assessing the correction capabilities of the architecture (that is, given a discrepancy between model and data, assessing if sparse measurements can provide some correction to the architecture output). As will be shown, the accurate reconstruction of every characteristic field, using both synthetic and experimental data, in real-time makes this approach suitable for interpretable monitoring and control purposes in the framework of a reactor digital twin.
The energy consumption and carbon footprint of Artificial Intelligence (AI) have become critical concerns due to rising costs and environmental impacts. In response, a new trend in green AI is emerging, shifting from the "bigger is better" paradigm, which prioritizes large models, to "small is sufficient", emphasizing energy sobriety through smaller, more efficient models. We explore how the AI community can adopt energy sobriety today by focusing on model selection during inference. Model selection consists of choosing the most appropriate model for a given task, a simple and readily applicable method, unlike approaches requiring new hardware or architectures. Our hypothesis is that, as in many industrial activities, marginal utility gains decrease with increasing model size. Thus, applying model selection can significantly reduce energy consumption while maintaining good utility for AI inference. We conduct a systematic study of AI tasks, analyzing their popularity, model size, and efficiency. We examine how the maturity of different tasks and model adoption patterns impact the achievable energy savings, ranging from 1% to 98% for different tasks. Our estimates indicate that applying model selection could reduce AI energy consumption by 27.8%, saving 31.9 TWh worldwide in 2025 - equivalent to the annual output of five nuclear power reactors.




Nuclear fusion plays a pivotal role in the quest for reliable and sustainable energy production. A major roadblock to viable fusion power is understanding plasma turbulence, which significantly impairs plasma confinement, and is vital for next-generation reactor design. Plasma turbulence is governed by the nonlinear gyrokinetic equation, which evolves a 5D distribution function over time. Due to its high computational cost, reduced-order models are often employed in practice to approximate turbulent transport of energy. However, they omit nonlinear effects unique to the full 5D dynamics. To tackle this, we introduce GyroSwin, the first scalable 5D neural surrogate that can model 5D nonlinear gyrokinetic simulations, thereby capturing the physical phenomena neglected by reduced models, while providing accurate estimates of turbulent heat transport.GyroSwin (i) extends hierarchical Vision Transformers to 5D, (ii) introduces cross-attention and integration modules for latent 3D$\leftrightarrow$5D interactions between electrostatic potential fields and the distribution function, and (iii) performs channelwise mode separation inspired by nonlinear physics. We demonstrate that GyroSwin outperforms widely used reduced numerics on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. GyroSwin shows promising scaling laws, tested up to one billion parameters, paving the way for scalable neural surrogates for gyrokinetic simulations of plasma turbulence.
The nuclear industry is advancing toward more new reactor designs, with next-generation reactors expected to be smaller in scale and power output. These systems have the potential to produce large volumes of information in the form of multivariate time-series data, which could be used for enhanced real-time monitoring and control. In this context, the development of remote autonomous or semi-autonomous control systems for reactor operation has gained significant interest. A critical first step toward such systems is an accurate diagnostics module capable of detecting and localizing anomalies within the reactor system. Recent studies have proposed various ML and DL approaches for anomaly detection in the nuclear domain. Despite promising results, key challenges remain, including limited to no explainability, lack of access to real-world data, and scarcity of abnormal events, which impedes benchmarking and characterization. Most existing studies treat these methods as black boxes, while recent work highlights the need for greater interpretability of ML/DL outputs in safety-critical domains. Here, we propose an unsupervised methodology based on an LSTM autoencoder with a dual attention mechanism for characterization of abnormal events in a real-world reactor radiation area monitoring system. The framework includes not only detection but also localization of the event and was evaluated using real-world datasets of increasing complexity from the PUR-1 research reactor. The attention mechanisms operate in both the feature and temporal dimensions, where the feature attention assigns weights to radiation sensors exhibiting abnormal patterns, while time attention highlights the specific timesteps where irregularities occur, thus enabling localization. By combining the results, the framework can identify both the affected sensors and the duration of each anomaly within a single unified network.
This study proposes a Physics-Informed Neural Network (PINN) framework to predict the low-cycle fatigue (LCF) life of irradiated austenitic and ferritic/martensitic (F/M) steels used in nuclear reactors. These materials experience cyclic loading and irradiation at elevated temperatures, causing complex degradation that traditional empirical models fail to capture accurately. The developed PINN model incorporates physical fatigue life constraints into its loss function, improving prediction accuracy and generalizability. Trained on 495 data points, including both irradiated and unirradiated conditions, the model outperforms traditional machine learning models like Random Forest, Gradient Boosting, eXtreme Gradient Boosting, and the conventional Neural Network. SHapley Additive exPlanations analysis identifies strain amplitude, irradiation dose, and testing temperature as dominant features, each inversely correlated with fatigue life, consistent with physical understanding. PINN captures saturation behaviour in fatigue life at higher strain amplitudes in F/M steels. Overall, the PINN framework offers a reliable and interpretable approach for predicting fatigue life in irradiated alloys, enabling informed alloy selection.
This paper evaluates the cost competitiveness of microreactors in today's electricity markets, with a focus on uncertainties in reactor costs. A Genetic Algorithm (GA) is used to optimize key technical parameters, such as reactor capacity, fuel enrichment, tail enrichment, refueling interval, and discharge burnup, to minimize the Levelized Cost of Energy (LCOE). Base case results are validated using Simulated Annealing (SA). By incorporating Probability Distribution Functions (PDFs) for fuel cycle costs, the study identifies optimal configurations under uncertainty. Methodologically, it introduces a novel framework combining probabilistic cost modeling with evolutionary optimization. Results show that microreactors can remain cost-competitive, with LCOEs ranging from \$48.21/MWh to \$78.32/MWh when supported by the Production Tax Credit (PTC). High reactor capacity, low fuel enrichment, moderate tail enrichment and refueling intervals, and high discharge burnup enhance cost efficiency. Among all factors, overnight capital cost (OCC) has the most significant impact on LCOE, while O&M and fuel cost uncertainties have lesser effects. The analysis highlights how energy policies like the PTC can reduce LCOE by 22-24%, improving viability despite cost variability. Compared to conventional nuclear, coal, and renewable sources like offshore wind, hydro, and biomass, optimized microreactors show strong economic potential. This research defines a realistic design space and key trade-offs, offering actionable insights for policymakers, reactor designers, and energy planners aiming to accelerate the deployment of affordable, sustainable microreactors.
This work presents a methodology for reconstructing the spatial distribution of the neutron flux in a nuclear reactor, leveraging real-time measurements obtained from ex-core detectors. The Kirchhoff-Helmholtz (K-H) equation inherently defines the problem of estimating a scalar field within a domain based on boundary data, making it a natural mathematical framework for this task. The main challenge lies in deriving the Green's function specific to the domain and the neutron diffusion process. While analytical solutions for Green's functions exist for simplified geometries, their derivation of complex, heterogeneous domains-such as a nuclear reactor-requires a numerical approach. The objective of this work is to demonstrate the well-posedness of the data-driven Green's function approximation by formulating and solving the K-H equation as an inverse problem. After establishing the symmetry properties that the Green's function must satisfy, the K-H equation is derived from the one-speed neutron diffusion model. This is followed by a comprehensive description of the procedure for interpreting sensor readings and implementing the neutron flux reconstruction algorithm. Finally, the existence and uniqueness of the Green's function inferred from the sampled data are demonstrated, ensuring the reliability of the proposed method and its predictions.